Static Memory Access Pattern Analysis on a Massively Parallel GPU

Authors

  • Byunghyun Jang
  • Dana Schaa
  • Perhaad Mistry
  • David Kaeli
Abstract

The performance of data-parallel processing can be highly sensitive to memory contention. In contrast to multi-core CPUs, which employ a number of memory latency minimization techniques such as multi-level caching and prefetching, Graphics Processing Units (GPUs) require that data-parallel computations reference memory in a deterministic pattern in order to reap the benefits of these many-core platforms. This sensitivity to memory access is primarily due to the Massively Parallel Processing (MPP) execution model and the underlying memory hardware architecture of GPUs, which are specifically tuned for graphics rendering [2, 4]. In this paper we present a static memory access pattern analysis model that provides guidance on how best to apply a wide range of memory optimizations on GPUs. Our analysis carefully takes into account the mapping of threads to data, a critical factor when attempting to exploit the full capabilities of current GPU architectures. We formulate a methodology that allows us to build tools that guide programmers on how best to apply algorithmic memory optimizations, and that can easily be integrated into a compiler pass. We demonstrate the power of our analysis model with a case study of a matrix multiplication implementation written in OpenCL and run on NVIDIA G80 and G200 series GPUs, which have slightly different memory architectures.
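To make the idea of "mapping of threads to data" concrete, the following is a minimal sketch and not the authors' analysis model. It assumes a naive row-major N x N matrix multiplication in which each thread reads A[row][k] and B[k][col], and it assumes that the accesses of 16 threads with consecutive x indices are examined together. The names `classify`, `HALF_WARP`, and the four index functions are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only; the names and the group size below are assumptions,
# not taken from the paper.
HALF_WARP = 16  # assumed number of adjacent threads examined together


def classify(index_fn, n, k=0, ty=0):
    """Classify the addresses touched by HALF_WARP threads with consecutive tx."""
    addrs = [index_fn(tx, ty, k, n) for tx in range(HALF_WARP)]
    strides = {b - a for a, b in zip(addrs, addrs[1:])}
    if strides == {0}:
        return "uniform"        # every thread reads the same element
    if strides == {1}:
        return "contiguous"     # adjacent threads read adjacent elements
    if len(strides) == 1:
        return "strided({})".format(strides.pop())
    return "irregular"


# Mapping 1: tx selects the column of C, ty selects the row.
def a_col_mapped(tx, ty, k, n):
    return ty * n + k           # A[row][k] with row = ty


def b_col_mapped(tx, ty, k, n):
    return k * n + tx           # B[k][col] with col = tx


# Mapping 2: tx selects the row of C, ty selects the column.
def a_row_mapped(tx, ty, k, n):
    return tx * n + k           # A[row][k] with row = tx


def b_row_mapped(tx, ty, k, n):
    return k * n + ty           # B[k][col] with col = ty


if __name__ == "__main__":
    n = 64
    for name, fn in [("A, mapping 1", a_col_mapped), ("B, mapping 1", b_col_mapped),
                     ("A, mapping 2", a_row_mapped), ("B, mapping 2", b_row_mapped)]:
        print(name, "->", classify(fn, n))
```

Under these assumptions, the same kernel body yields a contiguous access to B when the x index selects the column, but an access to A with stride N when the x index selects the row. This is the kind of difference a static analysis of index expressions can expose without running the kernel.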


Related resources

Trie Compression for GPU Accelerated Multi-Pattern Matching

Graphics Processing Units (GPU) allow for running massively parallel applications offloading the Central Processing Unit (CPU) from computationally intensive resources. However GPUs have a limited amount of memory. In this paper, a trie compression algorithm for massively parallel pattern matching is presented demonstrating 85% less space requirements than the original highly efficient parallel...


Parallel multi-dimensional range query processing with R-trees on GPU

The general purpose computing on graphics processing unit (GP-GPU) has emerged as a new cost effective parallel computing paradigm in high performance computing research that enables large amount of data to be processed in parallel. Large scale scientific data intensive applications have been playing an important role in modern high performance computing research. A common access pattern into s...


Effect of Instruction Fetch and Memory Scheduling on GPU Performance

GPUs are massively multithreaded architectures designed to exploit data level parallelism in applications. Instruction fetch and memory system are two key components in the design of a GPU. In this paper we study the effect of fetch policy and memory system on the performance of a GPU kernel. We vary the fetch and memory scheduling policies and analyze the performance of GPU kernels. As part of...


Massively Parallel Genetic Algorithm – Pattern Search for Nonlinear Optimization with GPU Computing

This paper presents a massively parallel Genetic Algorithm – Pattern Search (GA-PS) with graphics hardware acceleration on bound constrained nonlinear optimization problems. The objective of this study is to determine the effectiveness of using Graphics Processing Units (GPU) as a hardware platform for Genetic Algorithms (GA). The global search of the GA is enhanced by a local Pattern Search (P...


GPU Acceleration of Particle Advection Workloads in a Parallel, Distributed Memory Setting

Although there has been significant research in GPU acceleration, both of parallel simulation codes (i.e., GPGPU) and of single GPU visualization and analysis algorithms, there has been relatively little research devoted to visualization and analysis algorithms on GPU clusters. This oversight is significant: parallel visualization and analysis algorithms have markedly different characteristics ...




Publication date: 2010